Fix IndexMetaData loads after rollover#33394
Conversation
When we rollover and index we write the conditions of the rollover that the old index met into the old index. Loading this index metadata requires a working `NamedXContentRegistry` that has been populated with parsers from the rollover infrastructure. We had a few loads that didn't use a working `NamedXContentRegistry` and so would fail if they ever encountered an index that had been rolled over. Here are the locations of the loads and how I fixed them: * IndexFolderUpgrader - removed entirely. It existed to support opening indices made in Elasticsearch 2.x. Since we only need this change as far back as 6.4.1 which will supports reading from indices created as far back as 5.0.0 we should be good here. * TransportNodesListGatewayStartedShards - wired the `NamedXContentRegistry` into place. * TransportNodesListShardStoreMetaData - wired the `NamedXContentRegistry` into place. * OldIndexUtils - removed entirely. It existed to support the zip based index backwards compatibility tests which we've since replaced with code that actually runs old versions of Elasticsearch. In addition to fixing the actual problem I added full cluster restart integration tests for rollover which would have caught this problem and I added an extra assertion to IndexMetaData's deserialization code which will trip if we try to deserialize and index's metadata without a fully formed `NamedXContentRegistry`. It won't catch if use the *wrong* `NamedXContentRegistry` but it is better than nothing. Closes elastic#33316
|
Pinging @elastic/es-core-infra |
|
I've not run this through much testing locally but i'm opening it so I can get CI to run. I'll remove the |
|
@elasticmachine, retest this please. |
| // sometimes the request comes in before the local node processed that cluster state | ||
| // in such cases we can load it from disk | ||
| metaData = IndexMetaData.FORMAT.loadLatestState(logger, NamedXContentRegistry.EMPTY, | ||
| metaData = IndexMetaData.FORMAT.loadLatestState(logger, namedXContentRegistry, |
There was a problem hiding this comment.
While I'm 99% sure this change is correct, I don't know of a way to trigger this code locally.
There was a problem hiding this comment.
I'm not sure I follow. What do you mean with triggering code locally?
There was a problem hiding this comment.
This branch looks like the kind of thing that won't kick in much. Do we have a test that'll call it?
There was a problem hiding this comment.
I think we only have general IT tests ci runs. This class is not wel tested :(
There was a problem hiding this comment.
I stuck an exception there and found RecoveryFromGatewayIT.testStartedShardFoundIfStateNotYetProcessed when I ran the test. So it looks like we do cover it. So I feel good!
| // sometimes the request comes in before the local node processed that cluster state | ||
| // in such cases we can load it from disk | ||
| metaData = IndexMetaData.FORMAT.loadLatestState(logger, NamedXContentRegistry.EMPTY, | ||
| metaData = IndexMetaData.FORMAT.loadLatestState(logger, namedXContentRegistry, |
There was a problem hiding this comment.
While I'm 99% sure this change is correct, I don't know of a way to trigger this code locally.
|
Marking |
|
|
||
| @Override | ||
| public IndexMetaData fromXContent(XContentParser parser) throws IOException { | ||
| assert parser.getXContentRegistry() != NamedXContentRegistry.EMPTY |
| // sometimes the request comes in before the local node processed that cluster state | ||
| // in such cases we can load it from disk | ||
| metaData = IndexMetaData.FORMAT.loadLatestState(logger, NamedXContentRegistry.EMPTY, | ||
| metaData = IndexMetaData.FORMAT.loadLatestState(logger, namedXContentRegistry, |
There was a problem hiding this comment.
I'm not sure I follow. What do you mean with triggering code locally?
When we rollover and index we write the conditions of the rollover that the old index met into the old index. Loading this index metadata requires a working `NamedXContentRegistry` that has been populated with parsers from the rollover infrastructure. We had a few loads that didn't use a working `NamedXContentRegistry` and so would fail if they ever encountered an index that had been rolled over. Here are the locations of the loads and how I fixed them: * IndexFolderUpgrader - removed entirely. It existed to support opening indices made in Elasticsearch 2.x. Since we only need this change as far back as 6.4.1 which will supports reading from indices created as far back as 5.0.0 we should be good here. * TransportNodesListGatewayStartedShards - wired the `NamedXContentRegistry` into place. * TransportNodesListShardStoreMetaData - wired the `NamedXContentRegistry` into place. * OldIndexUtils - removed entirely. It existed to support the zip based index backwards compatibility tests which we've since replaced with code that actually runs old versions of Elasticsearch. In addition to fixing the actual problem I added full cluster restart integration tests for rollover which would have caught this problem and I added an extra assertion to IndexMetaData's deserialization code which will trip if we try to deserialize and index's metadata without a fully formed `NamedXContentRegistry`. It won't catch if use the *wrong* `NamedXContentRegistry` but it is better than nothing. Closes #33316
When we rollover and index we write the conditions of the rollover that the old index met into the old index. Loading this index metadata requires a working `NamedXContentRegistry` that has been populated with parsers from the rollover infrastructure. We had a few loads that didn't use a working `NamedXContentRegistry` and so would fail if they ever encountered an index that had been rolled over. Here are the locations of the loads and how I fixed them: * IndexFolderUpgrader - removed entirely. It existed to support opening indices made in Elasticsearch 2.x. Since we only need this change as far back as 6.4.1 which will supports reading from indices created as far back as 5.0.0 we should be good here. * TransportNodesListGatewayStartedShards - wired the `NamedXContentRegistry` into place. * TransportNodesListShardStoreMetaData - wired the `NamedXContentRegistry` into place. * OldIndexUtils - removed entirely. It existed to support the zip based index backwards compatibility tests which we've since replaced with code that actually runs old versions of Elasticsearch. In addition to fixing the actual problem I added full cluster restart integration tests for rollover which would have caught this problem and I added an extra assertion to IndexMetaData's deserialization code which will trip if we try to deserialize and index's metadata without a fully formed `NamedXContentRegistry`. It won't catch if use the *wrong* `NamedXContentRegistry` but it is better than nothing. Closes #33316
6.4 has a bad bug where it won't if any of the shards on the node have been rolled over. This documents that in the known issues for 6.4.0 and links to the fix in the bug fixes section of 6.4.1. Relates to elastic#33394
6.4 has a bad bug where it won't if any of the shards on the node have been rolled over. This documents that in the known issues for 6.4.0 and links to the fix in the bug fixes section of 6.4.1. Relates to #33394
|
nice job! |
When we rollover and index we write the conditions of the rollover that
the old index met into the old index. Loading this index metadata
requires a working
NamedXContentRegistrythat has been populated withparsers from the rollover infrastructure. We had a few loads that didn't
use a working
NamedXContentRegistryand so would fail if they everencountered an index that had been rolled over. Here are the locations
of the loads and how I fixed them:
indices made in Elasticsearch 2.x. Since we only need this change as far
back as 6.4.1 which will supports reading from indices created as far
back as 5.0.0 we should be good here.
NamedXContentRegistryinto place.NamedXContentRegistryinto place.index backwards compatibility tests which we've since replaced with code
that actually runs old versions of Elasticsearch.
In addition to fixing the actual problem I added full cluster restart
integration tests for rollover which would have caught this problem and
I added an extra assertion to IndexMetaData's deserialization code which
will trip if we try to deserialize and index's metadata without a fully
formed
NamedXContentRegistry. It won't catch if use the wrongNamedXContentRegistrybut it is better than nothing.Closes #33316